  • Italicizing and superscripting in putdocx text | An easier way?

    Hello,

    I am using -putdocx- entirely for writing the results portion of my manuscript which is generously sprinkled with italicized gene names as well as superscripted letters (e.g. TP53SH) within italicized gene names. Ideally, I want the final docx coming out of the do file to contain these formatted characters.

    Specifically:
    1. I don't want to chop up putdocx statements into separate lines of code for each instance of a formatted character, as shown in the help file.
    2. Nor do I want to add <<dyndocx>> code blocks around those characters.
    Both approaches look cumbersome and make the putdocx text blocks very hard to read.
    My current solution is to tag the characters uniquely within the do file and then run a recorded MS Word macro that replaces the tagged characters via Word's Advanced Find/Replace. This, too, is becoming cumbersome as I work on different computers at work and home. Is there a way to define these replacements programmatically, a priori, within the do file so that the docx comes out of the do file with the right formatting?

    For instance, "TP53mutt" needs to become TP53MUT (the '-mut' suffix is hard to isolate in MS Word Find/Replace, hence the 'mutt' tag; plain TP53mut would be too non-specific for Find/Replace). I have several such words and letters (TP53shit, TP53misn, TP53nmis for single-hit, missense and non-missense mutations). I would appreciate any pointers for a bespoke code block.

    Code:
    *1 Clear putdocx file from memory, if any.
    putdocx clear
    
    *3 Start the putdocx file
    // Create a document with a header
    putdocx begin, header(head)
    
    // Define the header content, and include page numbers
    putdocx paragraph, toheader(head) font(,14)
    putdocx text ("Results: Outcome analysis")
    
    **# Results 1. Cohort summary*--------------------------
    
    
    putdocx paragraph, style(Heading1) font("arial", 13, black) halign(both)
    putdocx text ("Baseline cohort outcome summary")
    *--------------------------------------------------------------------
    putdocx paragraph, font("arial", 11, black) spacing(line, 22pt)
    putdocx text ("The median duration of follow up from diagnosis of TP53mutt myeloid neoplasm to study exit (censoring or death) was......")
    
    
    gv p53 // program to step out of pwd to get that image
    putdocx image ".\Figure_02.tiff", width(2)
    
    putdocx save testing.docx, replace

  • #2
    Maybe using Unicode superscript letters:

    Code:
    local TP53mut = "TP53" + ustrunescape("\u1D39\u1D41\u1D40")
    mac list _TP53mut
    Code:
    _TP53mut:       TP53ᴹᵁᵀ



    • #3
      You could also pre-process the do file with a search and replace that builds the putdocx commands for you:

      Code:
      ********************************************************************************
      // step one define locals to be used in code  
      
      local TP53mut = "TP53" + ustrunescape("\u1D39\u1D41\u1D40")
      
      // make example do file
      
      local EOL = char(10)
      
      #delim ; 
      
      scalar code = 
      
      `" 
          *1 Clear putdocx file from memory, if any. `EOL'
          putdocx clear `EOL'
           `EOL'
          *3 Start the putdocx file `EOL'
          // Create a document with a header `EOL'
          putdocx begin, header(head) `EOL'
           `EOL'
          // Define the header content, and include page numbers `EOL'
          putdocx paragraph, toheader(head) font(,14) `EOL'
          putdocx text ("Results: Outcome analysis") `EOL'
           `EOL'
          **# Results 1. Cohort summary*-------------------------- `EOL'
           `EOL'
           `EOL'
          putdocx paragraph, style(Heading1) font("arial", 13, black) halign(both) `EOL'
          putdocx text ("Baseline cohort outcome summary") `EOL'
          *-------------------------------------------------------------------- `EOL'
          putdocx paragraph, font("arial", 11, black) spacing(line, 22pt) `EOL'
          putdocx text ("The median duration of follow up from diagnosis of `TP53mut' myeloid neoplasm to study exit (censoring or death) was......") `EOL'
           `EOL'
          putdocx save testing.docx, replace `EOL'
      "'  
      ;
      #delim cr
      
      tempfile code1
      di filewrite("`code1'", scalar(code))
      
      ********************************************************************************
      
      // step 2. search replace to build putdocx commands   
      
      clear 
      scalar code = fileread("`code1'")
      
      scalar code = subinstr(code, "TP53ᴹᵁᵀ", ///
                              char(34) + ")" ///
                              + char(10) ///
                              + "putdocx text (" ///
                              + char(34) + "TP53ᴹᵁᵀ" + char(34)  ///
                              + ")" + ", italic" ///
                              + char(10) + "putdocx text (" + char(34), ///
                              .)
      tempfile code2
      di filewrite("`code2'", scalar(code) )
      do `code2'
      
      exit
      Baseline cohort outcome summary

      The median duration of follow up from diagnosis of TP53ᴹᵁᵀ myeloid neoplasm to study exit (censoring or death) was......
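
      To make the substitution in step 2 concrete, here is a sketch of what the generated code should look like for the sentence above. It was expanded by hand from the subinstr() call rather than copied from a run, and the file name sketch_italic.docx is a placeholder. The single putdocx text line is split into three, with the gene name in its own italic statement:

      ```stata
      // Hand-expanded equivalent of the lines produced by the search/replace step
      putdocx clear
      putdocx begin
      putdocx paragraph, font("arial", 11, black)
      // text before the gene name: the inserted char(34) + ")" closes this statement
      putdocx text ("The median duration of follow up from diagnosis of ")
      // the gene name gets its own statement with the italic option
      putdocx text ("TP53ᴹᵁᵀ"), italic
      // the trailing "putdocx text (" + char(34) reopens the rest of the sentence
      putdocx text (" myeloid neoplasm to study exit (censoring or death) was......")
      putdocx save sketch_italic.docx, replace   // placeholder file name
      ```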



      • #4
        Wow....thanks much Bjarte Aagnes
        It definitely seems worth investing the time to define all my Unicode strings in the first step, possibly as an include file as Clyde Schechter showed me in an earlier post related to superscripting value labels. Admittedly, I am still trying to wrap my mind around the logic of your code (I need to read up on the asc and -ustrunescape- more to understand it). Meanwhile, I have a couple of follow-up queries:
        1. How did you get that superscripted TP53MUT in the -subinstr- code line within the do file?
        2. I tried adding one more entry of another superscripted character to your code block. But I cannot get the superscript to show up.
        3. Is the way I am adding the extra entry correct based on the code below? If so, I will build up on this.
        Code:
        ********************************************************************************
        // step one define locals to be used in code  
        
        local TP53mut = "TP53" + ustrunescape("\u1D39\u1D41\u1D40")
        local TP53sh  = "TP53" + ustrunescape("\u0053\u0048")
        
        // make example do file
        
        local EOL = char(10)
        
        #delim ; 
        
        scalar code = 
        
        `" 
            *1 Clear putdocx file from memory, if any. `EOL'
            putdocx clear `EOL'
             `EOL'
            *3 Start the putdocx file `EOL'
            // Create a document with a header `EOL'
            putdocx begin, header(head) `EOL'
             `EOL'
            // Define the header content, and include page numbers `EOL'
            putdocx paragraph, toheader(head) font(,14) `EOL'
            putdocx text ("Results: Outcome analysis") `EOL'
             `EOL'
            **# Results 1. Cohort summary*-------------------------- `EOL'
             `EOL'
             `EOL'
            putdocx paragraph, style(Heading1) font("arial", 13, black) halign(both) `EOL'
            putdocx text ("Baseline cohort outcome summary") `EOL'
            *-------------------------------------------------------------------- `EOL'
            putdocx paragraph, font("arial", 11, black) spacing(line, 22pt) `EOL'
            putdocx text ("The median duration of follow up from diagnosis of `TP53mut' myeloid neoplasm to study exit (censoring or death) was......TP53sh did well...") `EOL'
             `EOL'
            putdocx save testing.docx, replace `EOL'
        "'  
        ;
        #delim cr
        
        tempfile code1
        di filewrite("`code1'", scalar(code))
        
        ********************************************************************************
        
        // step 2. search replace to build putdocx commands   
        
        clear 
        scalar code = fileread("`code1'")
        
        scalar code = subinstr(code, "TP53ᴹᵁᵀ", ///
                                char(34) + ")" ///
                                + char(10) ///
                                + "putdocx text (" ///
                                + char(34) + "TP53ᴹᵁᵀ" + char(34)  ///
                                + ")" + ", italic" ///
                                + char(10) + "putdocx text (" + char(34), ///
                                .)
        scalar code = subinstr(code, "TP53SH", ///
                                char(34) + ")" ///
                                + char(10) ///
                                + "putdocx text (" ///
                                + char(34) + "TP53SH" + char(34)  ///
                                + ")" + ", italic" ///
                                + char(10) + "putdocx text (" + char(34), ///
                                .)                        
        tempfile code2
        di filewrite("`code2'", scalar(code) )
        do `code2'
        
        exit



        • #5
          It seems that superscript capital S, X, and Z are not available in Unicode: https://rupertshepherd.info/resource...ers-in-unicode
          Thus, the code below uses a superscript lowercase s instead:
          Code:
          local TP53mut = "TP53" + ustrunescape("\u1D39\u1D41\u1D40")
          local TP53sh  = "TP53" + ustrunescape("\u02E2\u1D34")
          In the do file use the locals:

          "The median duration of follow up from diagnosis of `TP53mut' myeloid neoplasm to study exit (censoring or death) was......`TP53sh' did well..."

          Then, search/replace and run:
          Code:
          // step 2. search replace to build putdocx commands   
          
          clear 
          scalar code = fileread("`code1'") // replace with name of do file
          
          foreach genename in TP53mut TP53sh {
          
          scalar code = subinstr(code, "``genename''", ///
                                  char(34) + ")" ///
                                  + char(10) ///
                                  + "putdocx text (" ///
                                  + char(34) + "``genename''" + char(34)  ///
                                  + ")" + ", italic" ///
                                  + char(10) + "putdocx text (" + char(34), ///
                                  .)
          }
          
          tempfile code2    
          di filewrite("`code2'", scalar(code) )
          do `code2'
          Baseline cohort outcome summary

          The median duration of follow up from diagnosis of TP53ᴹᵁᵀ myeloid neoplasm to study exit (censoring or death) was......TP53ˢᴴ did well...



          • #6
            That worked perfectly with the -foreach- loop. Also, I finally got the logic of what you were doing (splitting each instance of a `localized gene' into its own -putdocx- statement line) when I looked at the Results window. I will certainly find this useful for italicizing all the gene names once I create an include .do file.

            Too bad the superscript 'S' does not exist in Unicode; it is too common a letter. It looks like I might have to resort to replacing it in MS Word for now, which is still better than recording a bunch of VB macro replacements.
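
            A minimal sketch of the include-file idea mentioned above, under some assumptions: the file name gene_locals.do and the list macro genes are made up for illustration, and the file is written from within the example only so that the block runs on its own. It relies on -include- (unlike -do- or -run-) leaving the locals defined in the calling do file:

            ```stata
            // Write a small definitions file (in practice this would be saved once by hand)
            tempname fh
            file open `fh' using gene_locals.do, write text replace
            file write `fh' `"local TP53mut = "TP53" + ustrunescape("\u1D39\u1D41\u1D40")"' _n
            file write `fh' `"local TP53sh  = "TP53" + ustrunescape("\u02E2\u1D34")"' _n
            file write `fh' "local genes TP53mut TP53sh" _n
            file close `fh'

            // In the manuscript do file: pull the definitions into the current scope
            include gene_locals.do

            // The locals are now available, e.g. for the search/replace loop
            foreach genename of local genes {
                display "`genename' expands to ``genename''"
            }
            ```

            New gene names would then only need one extra line in the definitions file, and the loop picks them up through the genes list.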



            • #7
              One can use the second step to parse the name, adding italics to both parts and superscript only to the second part. Below, the lowercase suffix is treated as the second part; it is converted to uppercase and written in italics and superscript.

              Code:
              local TP53mut = "TP53" + "mut" // ustrunescape("\u1D39\u1D41\u1D40")
              local TP53sh  = "TP53" + "sh"  // ustrunescape("\u02E2\u1D34")
              Code:
              // step 2. search replace to build putdocx commands   
              
              scalar code = fileread("`code1'") // replace with name of do file
              
              foreach    genename in TP53mut TP53sh   {
                      
                  if ( ustrregexm("`genename'", "(^[A-Z0-9]+)(\p{Lowercase_Letter}{1,})$") ) {
                      
                      local  first = ustrregexs(1)
                      local  second = ustrupper(ustrregexs(2))
                  }
                  
                  scalar code = subinstr(code, "``genename''",               ///
                                      char(34) + ")"                         /// first part:
                                      + char(10)                             ///
                                      + "putdocx text ("                     ///
                                      + char(34) + "`first'" + char(34)      ///
                                      + ")" + ", italic"                     ///
                                      + char(10)                             /// second part   
                                      + "putdocx text ("                     ///
                                      + char(34) + "`second'" + char(34)     ///
                                      + ")" + ", italic script(super)"       ///
                                      + char(10) + "putdocx text ("          ///
                                      + char(34)                             ///
                                      , ///
                                      . ///
                                      )
              }
              Baseline cohort outcome summary

              The median duration of follow up from diagnosis of TP53MUT myeloid neoplasm to study exit (censoring or death) was......TP53SH did well...
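
              As a small illustration of the parsing step, the sketch below runs the same regular expression on its own and reports how each name is split. The else branch and the name tp53mut are additions for illustration only: they show one way to skip a name that does not follow the uppercase-prefix plus lowercase-suffix pattern, so that it does not silently reuse the first and second values from the previous iteration.

              ```stata
              // tp53mut is a made-up name that deliberately fails the pattern
              foreach genename in TP53mut TP53sh tp53mut {

                  if ( ustrregexm("`genename'", "(^[A-Z0-9]+)(\p{Lowercase_Letter}{1,})$") ) {
                      // group 1: uppercase letters and digits; group 2: lowercase suffix
                      local first  = ustrregexs(1)
                      local second = ustrupper(ustrregexs(2))
                      display "`genename': italic part = `first', italic superscript part = `second'"
                  }
                  else {
                      // no match: report it and move on without touching first/second
                      display "`genename': pattern not matched, skipped"
                      continue
                  }
              }
              ```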



              Complete example code:

              Code:
              ********************************************************************************
              // step one define locals to be used in code  
              
              // Unicode-superscript variants from #5; superseded by the plain-text definitions just below
              local TP53mut = "TP53" + ustrunescape("\u1D39\u1D41\u1D40")
              local TP53sh  = "TP53" + ustrunescape("\u02E2\u1D34")
              
              local TP53mut = "TP53" + "mut" // ustrunescape("\u1D39\u1D41\u1D40")
              local TP53sh  = "TP53" + "sh"  // ustrunescape("\u02E2\u1D34")
              
              // make example do file
              
              local EOL = char(10)
              
              #delim ; 
              
              scalar code = 
              
              `" 
                  *1 Clear putdocx file from memory, if any. `EOL'
                  putdocx clear `EOL'
                   `EOL'
                  *3 Start the putdocx file `EOL'
                  // Create a document with a header `EOL'
                  putdocx begin, header(head) `EOL'
                   `EOL'
                  // Define the header content, and include page numbers `EOL'
                  putdocx paragraph, toheader(head) font(,14) `EOL'
                  putdocx text ("Results: Outcome analysis") `EOL'
                   `EOL'
                  **# Results 1. Cohort summary*-------------------------- `EOL'
                   `EOL'
                   `EOL'
                  putdocx paragraph, style(Heading1) font("arial", 13, black) halign(both) `EOL'
                  putdocx text ("Baseline cohort outcome summary") `EOL'
                  *-------------------------------------------------------------------- `EOL'
                  putdocx paragraph, font("arial", 11, black) spacing(line, 22pt) `EOL'
                  putdocx text ("The median duration of follow up from diagnosis of `TP53mut' myeloid neoplasm to study exit (censoring or death) was......`TP53sh' did well...") `EOL'
                   `EOL'
                  putdocx save testing.docx, replace `EOL'
              "'  
              ;
              #delim cr
              
              tempfile code1
              di filewrite("`code1'", scalar(code))
              
              ********************************************************************************
              
              // step 2. search replace to build putdocx commands   
              
              scalar code = fileread("`code1'") // replace with name of do file
              
              foreach    genename in TP53mut TP53sh   {
                      
                  if ( ustrregexm("`genename'", "(^[A-Z0-9]+)(\p{Lowercase_Letter}{1,})$") ) {
                      
                      local  first = ustrregexs(1)
                      local  second = ustrupper(ustrregexs(2))
                  }
                  
                  scalar code = subinstr(code, "``genename''",               ///
                                      char(34) + ")"                         /// first part:
                                      + char(10)                             ///
                                      + "putdocx text ("                     ///
                                      + char(34) + "`first'" + char(34)      ///
                                      + ")" + ", italic"                     ///
                                      + char(10)                             /// second part   
                                      + "putdocx text ("                     ///
                                      + char(34) + "`second'" + char(34)     ///
                                      + ")" + ", italic script(super)"       ///
                                      + char(10) + "putdocx text ("          ///
                                      + char(34)                             ///
                                      , ///
                                      . ///
                                      )
              }
              
              tempfile code2    
              di filewrite("`code2'", scalar(code) )
              do `code2'
              exit



              • #8
                Love it, Bjarte Aagnes. Works perfectly without having to define using the -ustrunescape-. It is a pleasure to learn how to use scalars for such purposes, not to mention easy to read, well indented code which I will adapt and follow.
